Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset

Agrawal, Vasu, Akinyemi, Akinniyi, Alvero, Kathryn, Behrooz, Morteza, Buffalini, Julia, Carlucci, Fabio Maria, Chen, Joy, Chen, Junming, Chen, Zhang, Cheng, Shiyang, Chowdary, Praveen, Chuang, Joe, D'Avirro, Antony, Daly, Jon, Dong, Ning, Duppenthaler, Mark, Gao, Cynthia, Girard, Jeff, Gleize, Martin, Gomez, Sahir, Gong, Hongyu, Govindarajan, Srivathsan, Han, Brandon, He, Sen, Hernandez, Denise, Hristov, Yordan, Huang, Rongjie, Inaguma, Hirofumi, Jain, Somya, Janardhan, Raj, Jia, Qingyao, Klaiber, Christopher, Kovachev, Dejan, Kumar, Moneish, Li, Hang, Li, Yilei, Litvin, Pavel, Liu, Wei, Ma, Guangyao, Ma, Jing, Ma, Martin, Ma, Xutai, Mantovani, Lucas, Miglani, Sagar, Mohan, Sreyas, Morency, Louis-Philippe, Ng, Evonne, Ng, Kam-Woh, Nguyen, Tu Anh, Oberai, Amia, Peloquin, Benjamin, Pino, Juan, Popovic, Jovan, Poursaeed, Omid, Prada, Fabian, Rakotoarison, Alice, Ranjan, Rakesh, Richard, Alexander, Ropers, Christophe, Saleem, Safiyyah, Sharma, Vasu, Shcherbyna, Alex, Shen, Jia, Shen, Jie, Stathopoulos, Anastasis, Sun, Anna, Tomasello, Paden, Tran, Tuan, Turkatenko, Arina, Wan, Bo, Wang, Chao, Wang, Jeff, Williamson, Mary, Wood, Carleigh, Xiang, Tao, Yang, Yilin, Yao, Julien, Zhang, Chen, Zhang, Jiemin, Zhang, Xinyue, Zheng, Jason, Zhyzheria, Pavlo, Zikes, Jan, Zollhoefer, Michael

arXiv.org Artificial Intelligence

Human communication involves a complex interplay of verbal and nonverbal signals, essential for conveying meaning and achieving interpersonal goals. To develop socially intelligent AI technologies, it is crucial to build models that can both comprehend and generate dyadic behavioral dynamics. To this end, we introduce the Seamless Interaction Dataset, a large-scale collection of over 4,000 hours of face-to-face interaction footage from more than 4,000 participants in diverse contexts. This dataset enables the development of AI technologies that understand dyadic embodied dynamics, unlocking breakthroughs in virtual agents, telepresence experiences, and multimodal content analysis tools. We also develop a suite of models that use the dataset to generate dyadic motion gestures and facial expressions aligned with human speech. These models can take as input both the speech and the visual behavior of their interlocutors. We present a variant driven by speech from an LLM, along with integrations with 2D and 3D rendering methods, bringing us closer to interactive virtual agents. Additionally, we describe controllable variants of our motion models that can adapt emotional responses and expressivity levels, as well as generate more semantically relevant gestures. Finally, we discuss methods for assessing the quality of these dyadic motion models, which demonstrate the potential for more intuitive and responsive human-AI interactions.
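The dyadic conditioning the abstract describes — generating one participant's motion from their own speech together with the interlocutor's observed behavior, under a controllable expressivity level — can be illustrated with a toy sketch. Every name, shape, and weight below is hypothetical and not the paper's actual model or API; it only shows the input/output contract such a model might expose:

```python
from dataclasses import dataclass
from typing import List
import math

@dataclass
class DyadicFrame:
    """One time step of dyadic input (hypothetical, toy features)."""
    speech_energy: float    # the speaker's own speech signal
    partner_gesture: float  # the interlocutor's observed motion

def generate_motion(frames: List[DyadicFrame], expressivity: float = 1.0) -> List[float]:
    """Toy autoregressive motion generator: each output pose blends the
    speaker's speech with the interlocutor's gesture, scaled by a
    controllable expressivity level (echoing the controllable variants
    the abstract mentions). The 0.6/0.4 blend weights are arbitrary."""
    motion = []
    prev = 0.0
    for f in frames:
        # Smooth update toward a target conditioned on both participants.
        target = 0.6 * f.speech_energy + 0.4 * f.partner_gesture
        prev = prev + expressivity * math.tanh(target - prev)
        motion.append(prev)
    return motion
```

Raising `expressivity` makes the generated pose track its conditioning targets more aggressively, while `expressivity = 0.0` yields a flat, unresponsive avatar — a crude stand-in for the expressivity control described in the abstract.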


Business Listing Classification Using Case Based Reasoning and Joint Probability

Sood, Sanjay (AT&T) | Kar, Parijat P. (AT&T)

AAAI Conferences

One challenge of building and maintaining large-scale data management systems is managing data fusion from multiple data sources. Often, different sources represent the same data element in slightly different ways. These differences may reflect an error in the data or a disagreement between sources about the value that best represents the data point. When the quantity of data managed and fused becomes sufficiently large, manual review becomes impossible, and automated systems must be built to manage data fusion. Traditional solutions include simple voting theory, Dempster-Shafer theory, fuzzy matching, and incremental learning. This paper presents a novel approach to data fusion in the domain of business listings. The task at hand, business listing categorization, suffers from conflicting and incomplete data from disparate sources. Given the need for a high degree of accuracy in this task, we use a combination of case-based reasoning, joint probability, and domain-specific rules to improve data accuracy beyond these other methods.
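The three-stage pipeline the abstract names — case-based reasoning over previously resolved listings, probabilistic aggregation of source votes, and domain-specific rule overrides — can be sketched as follows. This is not the authors' implementation: the case base, the attribute-overlap similarity, and the collapsing of the joint-probability step into simple vote counting are all simplifying assumptions made for illustration:

```python
from collections import Counter

# Toy "case base": previously resolved listings with known categories.
CASE_BASE = [
    ({"name": "joe's pizza", "keyword": "pizza"}, "Restaurant"),
    ({"name": "mario pizzeria", "keyword": "pizza"}, "Restaurant"),
    ({"name": "ace hardware", "keyword": "tools"}, "Hardware Store"),
]

def similarity(a, b):
    """Fraction of attributes on which two listings agree (toy metric)."""
    keys = set(a) | set(b)
    return sum(a.get(k) == b.get(k) for k in keys) / len(keys)

def classify(listing, source_votes, domain_rules=None):
    """Fuse a CBR vote, per-source category votes, and rule overrides."""
    # 1. Case-based reasoning: retrieve the nearest resolved case.
    nearest = max(CASE_BASE, key=lambda c: similarity(listing, c[0]))
    cbr_vote = nearest[1]
    # 2. Aggregate source votes plus the CBR vote (stand-in for the
    #    joint-probability weighting described in the abstract).
    counts = Counter(source_votes + [cbr_vote])
    best, _ = counts.most_common(1)[0]
    # 3. Domain-specific rules override statistical evidence when they fire.
    for rule in (domain_rules or []):
        override = rule(listing)
        if override:
            return override
    return best
```

The ordering matters: rules run last so that hard domain knowledge can veto a statistically popular but known-wrong category, which matches the accuracy-first motivation in the abstract.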